Bioinformatics of the Brain
After differentially expressed genes have been identified, we may need to
determine whether, and which, groups of genes share similar expression
patterns. Here we can use tools that offer clustering methods such as k-means
or hierarchical clustering. In k-means clustering, each observation is
assigned to the cluster whose mean (centroid) is closest to it, and the
number of clusters, k, is fixed in advance [28]. It is one of the most widely
used algorithms in data mining. In hierarchical clustering, observations are
compared with one another and grouped step by step based on their
similarity [29]. Biological networks such as
gene interaction networks reveal collective patterns that would not be
apparent if the genes were examined individually. Although many tools are
available for visualizing biological networks, Cytoscape remains the most
popular one. Cytoscape facilitates network analysis as well as the
visualization of interactions among genes, proteins, and miRNAs [30]. It can
also incorporate data from other sources, such as KEGG. Enrichment analysis
tools can be used to enrich the data, in other words, to determine which
phenotype a group of genes is associated with. These tools draw on databases
such as The Cancer Genome Atlas (TCGA) [31], the Kyoto Encyclopedia of Genes
and Genomes (KEGG) [32], Gene Ontology, and PANTHER [33]. These and other
databases are used to identify common biological functions, signaling
pathways, interaction networks, and more. A widely used enrichment analysis
tool is Gene Set Enrichment Analysis (GSEA) [34]. Another is the Database for
Annotation, Visualization and Integrated Discovery (DAVID)
(https://david.ncifcrf.gov/home.jsp) [35].
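The workflow outlined in this paragraph, grouping genes by expression pattern
and then asking whether a group is over-represented in an annotated gene set,
can be illustrated with a minimal sketch. Everything in it is an assumption
made for illustration: the gene names and expression values are hypothetical
placeholders, the function names kmeans and hypergeom_enrichment_p are not
taken from any tool cited here, and the enrichment step is a plain
hypergeometric over-representation test rather than the exact statistic used
by GSEA or DAVID.

```python
from math import comb, dist


def kmeans(points, centroids, iterations=20):
    """Assign each point to its nearest centroid, then recompute each
    centroid as the mean of its assigned points, until nothing changes."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: mean of each cluster (an empty cluster keeps its centroid).
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)
        if updated == centroids:  # converged
            break
        centroids = updated
    return labels, centroids


def hypergeom_enrichment_p(universe, annotated, cluster, overlap):
    """P(X >= overlap) when `cluster` genes are drawn without replacement
    from `universe` genes, of which `annotated` belong to the gene set."""
    upper = min(annotated, cluster)
    favourable = sum(
        comb(annotated, i) * comb(universe - annotated, cluster - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(universe, cluster)


# Hypothetical toy data: expression of six placeholder genes in two conditions.
expression = {
    "gene_a": (1.0, 1.2),
    "gene_b": (0.9, 1.1),
    "gene_c": (1.1, 0.8),
    "gene_d": (5.0, 5.2),
    "gene_e": (4.8, 5.1),
    "gene_f": (5.3, 4.9),
}
genes = list(expression)
points = [expression[g] for g in genes]

# Seed with the first and last profiles so the run is deterministic (k = 2).
labels, centroids = kmeans(points, [points[0], points[-1]])
clusters = {}
for gene, label in zip(genes, labels):
    clusters.setdefault(label, set()).add(gene)

# Hypothetical annotated gene set, standing in for one pathway from a database.
pathway = {"gene_d", "gene_e", "gene_f"}
high = clusters[1]
p_value = hypergeom_enrichment_p(
    universe=len(genes),
    annotated=len(pathway),
    cluster=len(high),
    overlap=len(high & pathway),
)
print(sorted(high), p_value)
```

In practice the initial centroids are usually chosen at random and the
clustering is repeated several times, and enrichment p-values are corrected
for the number of gene sets tested; both steps are omitted here for brevity.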
There are also databases that store the discoveries made through the
techniques mentioned here and others. The Genetic Association Database (GAD)
is a repository of information from published genetic association studies,
in which the data and metadata presented in each study have been structured
into a common format [36]. An extensive collection of human genes and genetic
features can be found in the Online Mendelian Inheritance in Man (OMIM)
database [37]. DisGeNET is a database that compiles data on human
gene-disease and variant-disease relationships from numerous sources,
including GAD and OMIM [38]. Figure 8.2 also displays the number of genes
retrieved from DisGeNET that are connected to the diseases and disorders
under study.
8.5 Bioinformatics Studies on Brain Diseases and Disorders
Numerous experiments have been carried out since the emergence of microarray
and RNA-seq technologies. Accordingly, only recent studies involving